Rab GTPases and their effectors facilitate vesicular transport by
tethering donor vesicles to their respective target membranes. Rab9
mediates late endosome to trans-Golgi transport and has recently been
found to be a key cellular component for human immunodeficiency
virus-1, Ebola, Marburg, and measles virus replication, suggesting
that it may be a novel target in the development of broad spectrum
antiviral drugs. As part of our structure-based drug design program,
we have determined the crystal structure of a C-terminally truncated
human Rab9 (residues 1-177) to 1.25-Å resolution. The overall
structure shows a characteristic nucleotide binding fold consisting of
a six-stranded β-sheet surrounded by five α-helices with a tightly
bound GDP molecule in the active site. Structure-based sequence
alignment of Rab9 with other Rab proteins reveals that its active site
consists of residues highly conserved in the Rab GTPase family,
implying a common catalytic mechanism. However, Rab9 contains seven
regions that are significantly different in conformation from other
Rab proteins. Some of those regions coincide with putative
effector-binding sites and switch I and switch II regions identified
by structure/sequence alignments. The Rab9 structure at near atomic
resolution provides an excellent model for structure-based antiviral
drug design.



Rab proteins, the largest subfamily of the Ras-like small GTPase
superfamily, serve as molecular switches mediating tethering, docking,
fusion, and motility of intracellular membranes (1). Rabs cycle
between GTP-bound (on/active) and GDP-bound (off/inactive)
conformations (2-4). The active form is stabilized by additional
hydrogen bond interactions with the γ-phosphate of GTP mediated by
serine residues in the phosphate-binding loop and switch I region as
well as an extensive hydrophobic interface between the switch I and II
regions (5, 6). The inactive conformation usually has displaced and
mobile switch regions (3, 7). Biochemical and genetic studies of
chimeric and mutant Rab proteins have identified several hypervariable
regions, including the N and C termini and the α3/β5 loop, that play
an important role in determining functional specificity (8, 9).

The Rab9 GTPase is localized predominantly to late endosomes and is
required for the transport of mannose 6-phosphate receptors from
endosomes to the trans-Golgi network (10, 11). Rab9 facilitates
vesicular transport by pairing with its cognate Rab effector p40 (12).
Rab9 also interacts with the vesicle cargo selection protein TIP47,
which has been shown to bind the cytoplasmic tail of the HIV-1
envelope glycoprotein subunit gp41 (13). Targeting Rab9 mRNA for
degradation with small interfering RNA recently identified Rab9 as a
key cellular component for HIV-1, Ebola, Marburg, and measles virus
replication,² suggesting that inhibitors of Rab9 function, if
developed, might prove useful in the control of those viruses. As part
of a new structure-based antiviral drug design program, we have
determined the crystal structure of a C-terminally truncated human
Rab9 (residues 1-177) to a near atomic resolution of 1.25 Å.

EXPERIMENTAL PROCEDURES 

Cloning and Expression — The gene for human Rab9 (GenBank™ accession
number NM_004251) was obtained from the IMAGE clone collection (IMAGE
ID number 4139714) through distribution by Open Biosystems. A
C-terminally truncated fragment coding for residues 1-177 (20.1 kDa)
was PCR subcloned using primers 5'-G ACA GCT AGC ATG GCA GGC AAA TCA
TCA CTT TTT AAA G-3' and 5'-C ATG GAT CCT TCA GTC CTC GGT AGC AAG AAC
TCT TC-3' into the NheI/BamHI restriction sites of pET28b (Novagen);
the resulting construct encodes a Rab9-(1-177) protein product with an
N-terminal His6-containing fusion (MGSSHHHHHHSSGLVPRGSHMAS). The
pET28-Rab9-(1-177) vector was transformed into Escherichia coli
BL21(DE3) (Novagen), and overproduction of the fusion protein was
induced at an A600 nm of ~2.0 with 1 mM
isopropyl-1-thio-β-D-galactopyranoside at 310 K for 2 h; the cells
were harvested by centrifugation and frozen at 253 K.
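
The primer design described above can be sanity-checked with a short script. The following Python sketch looks for the NheI (GCTAGC) and BamHI (GGATCC) recognition sequences in the two primers and translates their coding portions. Only the primer sequences come from the text; the helper names and the reading-frame offset used for the reverse primer are illustrative assumptions chosen by inspection.

```python
# Sketch: consistency check of the two PCR primers quoted in the text.
# Helper names are hypothetical; only the primer sequences come from the paper.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def translate(seq):
    """Translate complete codons until the first stop codon."""
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


FORWARD = "GACAGCTAGCATGGCAGGCAAATCATCACTTTTTAAAG"
REVERSE = "CATGGATCCTTCAGTCCTCGGTAGCAAGAACTCTTC"
NHEI, BAMHI = "GCTAGC", "GGATCC"

# Coding portion of the forward primer starts at the ATG start codon.
n_terminal = translate(FORWARD[FORWARD.index("ATG"):])

# The reverse primer anneals to the antisense strand, so read its complement.
# Offset 2 is an assumption picked by inspection to put the codons in frame.
c_terminal = translate(reverse_complement(REVERSE)[2:])

print("NheI site in forward primer:", NHEI in FORWARD)
print("BamHI site in reverse primer:", BAMHI in REVERSE)
print("N-terminal residues encoded:", n_terminal)
print("C-terminal residues encoded before the stop codon:", c_terminal)
```

Under these assumptions the reverse primer places a stop codon directly after the codon for residue 177, which is consistent with the truncation at residue 177 stated in the text.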

Protein Purification — The cell pellet was resuspended in nickel
buffer A (20 mM Tris, pH 8.0, 500 mM NaCl, 5 mM imidazole), lysed by
sonication, and centrifuged at 20,000 × g for 20 min at 277 K. The
soluble fraction was filtered through a 0.45-micron filter and applied
to chelating Sepharose (Amersham Biosciences), which had been
previously charged with 50 mM NiSO4 and equilibrated with nickel
buffer A. The column was then washed with nickel wash buffer (20 mM
Tris-HCl, pH 8.0, 500 mM NaCl, 55 mM imidazole), and the
His6-Rab9-(1-177) fusion protein was eluted with nickel elution buffer
(20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 350 mM imidazole). The
His6-Rab9-(1-177) fusion protein was then dialyzed against Thr buffer
(20 mM Tris-HCl, pH 8.4,



1 The abbreviations used are: HIV, human immunodeficiency virus;
MES, 4-morpholineethanesulfonic acid; r.m.s., root mean square;
diphosphate-monothiophosphate.
 46 NKDLEVDGHFVTMQIWDTAGQERFRSLRTPFYRGSDCCLLTFSVD
    B2 B3 H2 B4

 91 DSQSFQNLSNWKKEFIYYADVKEPESFPFVILGNKIDISERQVST
    H3 B5

136 EEAQAWCRDNGDYPYFETSAKDATNVAAAFEEAVRRVLATEDRSD

181 HLIQTDTVNLHRKPKPSSSCC
Fig. 1. Amino acid sequence of human Rab9 and assignment of its
secondary-structure elements. α-Helix and β-sheet regions defined by
the crystal structure are underlined and labeled (α-helices starting
with H and β-strands with B).



Fig. 2. Stereoview of the Cα trace of Rab9. Every 10th residue and
both N and C termini are labeled. Residues 1, 35-38, and 176-177 are
missing from the refined model and are not shown.




Table I

Data and phasing statistics
  Space group                              P1
  Unit cell: a, b, c (Å), α, β, γ (°)      38.40, 45.62, 51.22, 99.7, 107.2, 101.8
  Wavelength (Å)                           1.00
  Resolution (Å)                           1.25
  Reflections (total/unique)               507,012/73,993
  Completeness (%)                         84.7 (39.9)ᵃ
                                           25.5 (2.3)ᵃ
                                           6.3 (39.2)ᵃ
  Search model                             Rab11a dimer

  Number of reflections
  Protein/GDP/water                        2662/56/508
  Rwork/Rfree (%)                          13.9/19.6
  r.m.s. deviations, length/angle distance 0.013 Å/0.032 Å

ᵃ Values in parentheses are for the highest resolution shell.



150 mM NaCl, 2.5 mM CaCl2) at 277 K, and precipitate was removed by
centrifugation at 20,000 × g for 20 min at 277 K. To the soluble
fraction, 1 unit of thrombin protease (Novagen) was added per
milligram of fusion protein, and the His tag was removed by digestion
for 4 h at 298 K (thrombin cleavage results in a Rab9a-(1-177) protein
with an N-terminal GSHMAS extension). The thrombin cleavage reaction
was diluted (1:3, v/v) with 20 mM MES, pH 6.5, and applied to
Q-Sepharose (Amersham Biosciences), which had been previously
equilibrated with Q buffer A (20 mM MES, pH 6.5, 50 mM NaCl). Native
Rab9-(1-177) was eluted from Q-Sepharose with a 50-750 mM NaCl linear
gradient in MES, pH 6.5; fractions containing native Rab9-(1-177) were
identified by denaturing gel electrophoresis and pooled. The pooled Q
fractions were then further purified by gel filtration on Sephacryl
S-200 (Amersham Biosciences) in MES, pH 6.5, 150 mM NaCl; fractions
containing Rab9-(1-177) were pooled and concentrated by
ultrafiltration.

Crystallization and Data Collection — The stock protein solution used
for crystallization contained 20 mM MES buffer, pH 6.5, and 150 mM




A rainbow trace from the N to the C terminus. Both termini and the
secondary structure elements are labeled. GDP in the active site is
shown in ball-and-stick formation. Gray, carbon atoms; blue, nitrogen
atoms; red, oxygen atoms; purple, phosphorus atoms.

sodium chloride with a protein concentration of 10 mg/ml. Crystals were
grown at 277 K by the hanging-drop vapor diffusion method with 100
mM sodium acetate buffer, pH 5.0, 5% (v/v) polyethylene glycol 4000 as
Fig. 4. Stereoview of the active site
showing the interactions between 
Rab9 and GDP. Gray, carbon atoms; 
blue, nitrogen atoms; red, oxygen atoms; 
purple, phosphorus atoms. The bonds in 
GDP are colored green. 







Table II

Hydrogen bonds between Rab9 and GDP

Rab9 atoms       GDP atoms      Hydrogen bond distance (Å)

Monomer A
Gly-17 N         O1B
Gly-19 N         O3A
Lys-20 N         (illegible)    2.85
Lys-20 NZ        O2B            2.81
Ser-21 N         O3B            2.91
Ser-22 N         O1A
Ser-22 OG        O1A            2.?1
Thr-34 OG1                      3.02
Thr-34 OG1       O2ᵃ            3.05
Lys-125 NZ       O4ᵃ            3.04
Asp-127 OD1      N1             2.82
Asp-127 OD2      N2             2.83
Ala-155 N        O6             2.86

Monomer B
Gly-17 N         O1B
Gly-19 N         O3A            3.13
Gly-19 N         O2B            3.02
Lys-20 N         O2B            2.88
Lys-20 NZ        O2B            2.75
Ser-21 N         O3B            2.96
Ser-22 N         O1A
Ser-22 OG        O1A            2.70
Thr-34 O         O2ᵃ            2.50
Asn-124          N7
Lys-125 NZ       O4ᵃ            2.99
Asp-127 OD1      N1             2.72
Asp-127 OD2
Ala-155 N        O6             2.82

ᵃ Denotes atoms from sugar rings.

crystallization solution. Crystals formed in space group P1 with a =
38.40 Å, b = 45.62 Å, c = 51.22 Å, α = 99.8°, β = 107.2°, and γ =
101.8° and contained two monomers in the unit cell. X-ray diffraction
data to 1.25-Å resolution were collected at beamline 22-ID in the
facilities of the South East Regional Collaborative Access Team at the
Advanced Photon Source, Argonne National Laboratory. The statistics
for data collection and processing are summarized in Table I.
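
The cell parameters quoted above allow a rough packing estimate. The Python sketch below applies the general triclinic cell volume formula, V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ), and divides by the protein mass in the cell to obtain a Matthews coefficient. This calculation is not reported in the text; the monomer mass of 20.1 kDa is taken from the cloning section and ignores the GSHMAS extension and the bound GDP, so the result is only approximate.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic angstroms) of a general triclinic unit cell."""
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle))
        for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    return a * b * c * math.sqrt(
        1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g
    )


# Cell parameters as quoted in the crystallization paragraph.
volume = triclinic_cell_volume(38.40, 45.62, 51.22, 99.8, 107.2, 101.8)

MONOMERS_PER_CELL = 2        # two monomers in the P1 cell (from the text)
MONOMER_MASS_DA = 20100.0    # assumption: 20.1 kDa for residues 1-177 only

# Matthews coefficient: cell volume per dalton of protein.
matthews = volume / (MONOMERS_PER_CELL * MONOMER_MASS_DA)

print(f"unit cell volume: {volume:.0f} cubic angstroms")
print(f"Matthews coefficient: {matthews:.2f} cubic angstroms per dalton")
```

A value near 2 Å³/Da would indicate fairly tight packing, in keeping with crystals that diffract to high resolution.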

Structure Determination and Refinement — The orientation and position
of the Rab9 dimer in the P1 unit cell were determined using the
molecular replacement protocols in Crystallography & NMR System (14)
starting from the structure of Rab11a (PDB code 1OIV (15)) as the
search model. The composite omit map was calculated to guide electron
density fitting of the model. Energy-restrained crystallographic
refinement was carried out with maximum likelihood algorithms
implemented in Crystallography & NMR software (14). Refinement
proceeded through several cycles in combination with manual checking
by the program O (16). The addition of two GDPs and 473 water
molecules and refinement up to 1.25 Å resulted in R and Rfree values
of 0.213 and 0.232, respectively. Further refinement was continued
with SHELX-97 (17) by subjecting the structure to cycles of isotropic
conjugate gradient least squares refinement; then tightly restrained
anisotropic displacement parameters were introduced and refined. The
final refinement cycle resulted in R/Rfree values of 0.139/0.196. The
final model contains residues 2-34 and 39-175 of monomer A, residues
5-34, 39-110, and 115-175 of monomer B, and 2 GDP molecules plus 508
water molecules. The phasing and refinement statistics are summarized
in Table I.
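
For readers unfamiliar with the R values quoted above, the conventional crystallographic residual is R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated over the working reflections for R and over a held-out test set for Rfree. The Python sketch below shows that arithmetic on made-up amplitudes. It assumes the calculated amplitudes are already on the same scale as the observed ones and is not the CNS or SHELX-97 implementation.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual between observed and calculated amplitudes.

    Assumes both lists are index-matched and already on a common scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Made-up amplitudes, for illustration only (not data from this study).
working_obs, working_calc = [100.0, 50.0, 25.0], [90.0, 55.0, 25.0]
free_obs, free_calc = [80.0, 40.0], [70.0, 46.0]

r_work = r_factor(working_obs, working_calc)
r_free = r_factor(free_obs, free_calc)

print(f"R (working set): {r_work:.3f}")
print(f"R (free set):    {r_free:.3f}")
```

Because the free set is excluded from refinement, Rfree is normally higher than R, as with the 0.196 versus 0.139 reported for the final model.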

Protein Fold Analysis — Secondary structure elements were defined by
the hydrogen-bonding patterns in combination with visual inspection. The
Fig. 5. Structure-based sequence alignment of Rab9 with the most
similar structures. The sequence number for the first residue of [...]
structural elements in Rab9 are indicated by underlining the sequence
and labeling as in Fig. 1. Residues that are highly conserved in the
Rab GTPase family are indicated in boldface type. Residues that are
known to contact GDP in protein-GDP complex structures are indicated
in red. Additional residues that would interact with the γ-phosphate
in protein-GTP analog complex structures are shown in purple. The
seven regions that show high degree of conformational variation among
the superimposed structures (see Fig. 6) are highlighted in gold and
labeled. See Table III for protein abbreviations and references.



Dali algorithm of comparing protein domain structures by alignment of 
distance matrices was used to search for structural homologues of Rab9 
and also used for structure-based sequence alignment (18, 19). Ribbon 
diagrams were prepared by the program MOLSCRIPT (20). 
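
Comparing structures through distance matrices has the advantage that the matrix of intramolecular Cα-Cα distances does not change when a molecule is rotated or translated, so two structures can be compared without first superimposing them. The Python sketch below illustrates only that invariance on made-up coordinates. It is not the Dali algorithm, and the function names are hypothetical.

```python
import math


def distance_matrix(coords):
    """Matrix of pairwise distances between 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def rotate_about_z(coords, angle_deg):
    """Rotate every point about the z axis by angle_deg degrees."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(x * cos_t - y * sin_t, x * sin_t + y * cos_t, z)
            for x, y, z in coords]


# Made-up coordinates standing in for four C-alpha atoms (angstroms).
toy_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.5, 3.4, 0.0), (9.1, 3.9, 1.2)]

original = distance_matrix(toy_ca)
rotated = distance_matrix(rotate_about_z(toy_ca, 73.0))

# Largest change in any pairwise distance after the rotation.
max_difference = max(
    abs(d1 - d2)
    for row1, row2 in zip(original, rotated)
    for d1, d2 in zip(row1, row2)
)

print(f"largest change in any pairwise distance: {max_difference:.2e}")
```

The change is at the level of floating-point rounding, which is why distance matrices are a convenient orientation-independent description of a fold.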

RESULTS 

Structure Determination — The human Rab9 variant we used 
for crystal structure determination included residues 1-177, 
lacking its last 24 residues (Fig. 1). Known as the C-terminal 
hypervariable region, the amino acid sequence of this region in 
Rab9 is poorly conserved with respect to other Rab proteins. 
Therefore, we excluded the C-terminal 24 residues from our 
cloning and crystallographic studies. We will refer below to this 
truncated form of the protein as Rab9. 

Rab9 bound to GDP was crystallized, and its structure was 
determined by molecular replacement (Table I). The structure 
was refined against 1.25-Å resolution data, making it one of the
highest resolution structures in the Rab protein family. Both N 
and C termini (residues 1 and 176-177 of monomer A and 
residues 1-4 and 176-177 of monomer B) and some loop re- 
Table III

Alignment statistics of Rab9 with structurally similar proteins
Statistics were produced using the Dali algorithm (18, 19).

Protein                     Protein Data Bank code

GDP-bound Rab9              1WMS
Gpp(NH)p-bound Ypt7p        1KY2
GppSp-bound Rab11a          1OIW
Gpp(NH)p-bound p21Ras       1CTQ
GDP-bound Rab11a            1OIV
GDP-bound Ypt7p             1KY3
Gpp(NH)p-bound Sec4         1G17
Gpp(NH)p-bound Rab5a        1N6H
Gpp(NH)p-bound Rab5c        1HUQ
GDP-bound Sec4              1G16
Gpp(NH)p-bound Ypt51        1EK0

Z-score values recovered without row assignment: 24.9, 24.7, 24.2, 24.0

a Z-score, strength of structural similarity in standard deviations above expected.
b Positional root mean square deviation of superimposed Cα atoms in Å.
c Total number of equivalent residues.
d Length of the entire chain of the equivalent structure.
e Percentage of sequence identity over equivalent positions.



Fig. 6. Structure comparison. A, stereoview of structural alignment of
Rab9 with three GTPases. The Cα backbone in GDP-bound Rab9 (black,
residues 2-34 and 39-175) is superimposed with GDP-bound Ypt7p (green,
residues 7-37, 41-46, 77-181, and Protein Data Bank accession number
1KY3), GDP-bound Rab11a (blue, residues 6-173, and Protein Data Bank
accession number 1OIV), and Gpp(NH)p-bound p21Ras (red, residues
1-166, Protein Data Bank accession number 1CTQ). For clarity, only the
GDP molecule in the active site of Rab9 structure is shown and colored
black. Both N and C termini of Rab9 are labeled. B, stereoview of Rab9
backbone structure highlighting the seven hypervariable regions shown
in yellow while the rest are shown in black. Both termini and the
secondary structure elements are labeled. GDP in the active site is
shown in ball-and-stick formation. Gray, carbon atoms; blue, nitrogen
atoms; red, oxygen atoms; purple, phosphorus atoms.



gions (residues 34-38 of both monomers and residues 111-114 
of monomer B) were disordered and could not be seen in the 
experimental electron density map. The final refined model, 
which includes residues 2-34 and 39-175 of monomer A, res- 
idues 5-34, 39-110, and 115-175 of monomer B, 2 GDP mole- 
cules, and 508 ordered water molecules, has a working R value 
of 0.139 and a free R value of 0.196. The stereochemistry is 
excellent with r.m.s. deviations for bond lengths and angle 
distances of 0.013 Å and 0.032 Å, respectively (Table I). The
Ramachandran plot statistics showed that 93.2% of the back- 
bone dihedral angles were in the most favored regions, 6.8% in 
the additional allowed regions, and none of the non-glycine 



residues were in the disallowed regions. The two crystallo- 
graphically unique Rab9 molecules in the crystal unit cell have 
almost identical structures with the r.m.s. deviation between 
the 161 equivalent Cα atoms of 0.40 Å. We will use monomer A
in our description of Rab9 structure. 
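
The r.m.s. deviation quoted for the two monomers is the square root of the mean squared distance between equivalent Cα atoms after superposition. A minimal Python sketch of that final step follows, using made-up coordinates. It assumes the two coordinate sets are already superimposed and paired atom by atom; it does not perform the optimal superposition itself.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Made-up, already superimposed coordinates (angstroms), for illustration only.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]

value = rmsd(toy_a, toy_b)
print(f"r.m.s. deviation over {len(toy_a)} atom pairs: {value:.3f} angstroms")
```

For the comparison described above, the two lists would hold the 161 equivalent Cα positions of monomers A and B.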

Overall Structure of Rab9 — Like other members of the Rab GTPase
family, Rab9 adopts a classical nucleotide binding fold consisting of
a six-stranded β-sheet surrounded by five α-helices (Figs. 2 and 3).
The five α-helices (H1-H5) and six β-strands (B1-B6) connect with a
B1-H1-B2-B3-H2-B4-H3-B5-H4-B6-H5 topology containing 30.5% (54/177)
α-helix, 28.8% (51/177) β-sheet, 37.3% (66/177) turn/loop, and 3.4%
(6/177) other (Figs.




High Resolution Structure of Rab9 GTPase 




1 and 3). The six β-strands arrange in the order of B2-B3-B1-B4-B5-B6,
forming a mostly parallel β-sheet except that B2 is antiparallel to
the rest in strand direction.
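
The secondary structure percentages quoted above follow directly from the residue counts. The short Python check below recomputes them from the counts given in the text; the dictionary keys are just labels chosen for this sketch.

```python
# Residue counts per secondary structure class, as quoted in the text.
counts = {"alpha-helix": 54, "beta-sheet": 51, "turn/loop": 66, "other": 6}

total = sum(counts.values())
percentages = {
    name: round(100 * number / total, 1) for name, number in counts.items()
}

print("total residues:", total)
for name, percent in percentages.items():
    print(f"{name}: {counts[name]}/{total} = {percent}%")
```

The counts sum to the 177 residues of the truncated construct, so the four classes account for every residue.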

Active Site Structure — The crystal structure reported here contains a
tightly bound GDP molecule in the active site (Figs. 3 and 4). Rab9
residues making hydrogen bonds with GDP include Gly-17, Gly-19,
Lys-20, Ser-21, Ser-22, Thr-34, Asn-124, Lys-125, Asp-127, and Ala-155
(Table II, Fig. 4, and as shown by the color red in Fig. 5). These
residues together with Gly-16, Val-18, Phe-32, Ile-126, and Lys-156
form a tight binding pocket accommodating the GDP molecule. More than
a dozen hydrogen bonds are formed between Rab9 and GDP, indicating a
strong binding affinity of Rab9 for GDP. This is supported by the
observation that the GDP molecule remained in the active site
throughout Rab9 purification; that is, it comes naturally bound to
Rab9 from the cell culture. Residues Thr-39 and Gly-65 are highly
conserved in the Rab GTPase family and are predicted to interact with
the γ-phosphate of GTP in the active state of Rab9.

DISCUSSION 

Structure Comparison — The overall structure of Rab9 is very similar
to the prototype Ras protein p21Ras (21) and several Rab proteins
(Table III). Among those, Rab9 has the highest sequence identity with
Ypt7p (54% over 153 equivalent positions), followed by Rab11a (43%
over 161 equivalent positions). The structural similarity Z-scores
(18, 19) range from 26.6 to 23.2 with r.m.s. deviations of equivalent
positions in the range of 1.4-2.0 Å. Structure-based sequence
alignment reveals that the active site of Rab9 consists of residues
highly conserved in the Rab GTPase family (Fig. 5), implying a common
catalytic mechanism. However, Rab9 contains seven hypervariable
regions that are significantly different in conformation from other
Rab proteins (Figs. 5 and 6). Some of those regions coincide with
putative effector-binding sites and conformational switch I and switch
II regions identified by earlier crystallographic studies of other Rab
proteins. Regions II and IV correspond to the switch I and switch II,
respectively, whereas regions I, V, and VII correspond to the three
effector-binding sites/complementary determining regions. These seven
hypervariable regions in Rab9 structure may serve as sites for
antiviral drug binding and provide an excellent target for
structure-based drug design and development.